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Abstract 

Chemical reaction networks (CRNs) model the behavior of molecules 
in a well-mixed system. The emerging field of molecular programming 
uses CRNs not only as a descriptive tool, but as a programming lan¬ 
guage for chemical computation. Recently, Chen, Doty and Soloveichik 
introduced a new model of chemical kinetics, rate-independent continu¬ 
ous CRNs (CCRNs), to study the chemical computation of continuous 
functions. A fundamental question of a CRN is whether a state of 
the system is reachable through a sequence of reactions in the net¬ 
work. This is known as the reachability problem. In this paper, we 
investigate CCRN-REACH, the reachability problem for this model of 
chemical reaction networks. We show that, for continuous CRNs, con¬ 
structing a path to a state of the network is computable in polynomial 
time. We also prove that a related problem, Sub-CCRN-REACH, is 
NP-complete. 


1 Introduction 

Abstract chemical reaction networks (CRNs) model chemical interactions 
in a well mixed system. Informally, CRNs consist of a finite set of species 
of chemicals (usually written abstractly as capital letters; i.e.. A, B, etc.) 

and a finite set of reactions between these species. A simple example is the 

k 

CRN consisting of species A, B and C, with one reaction 2A + B ^ 2C 
(taking A to be the hydrogen molecule H 2 ., B to be the oxygen molecule 
O 2 , and C to be the water molecule H 2 O, this CRN models the formation 
of water molecules with kinetic rate constant k). CRNs have historically 
been used as a descriptive tool, allowing researchers to formally analyze 
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the behavior of natural chemical systems. However, the field of molecular 
programming has recently brought CRNs to prominence as a programming 
language for chemical computation. Molecular programming, as the name 
suggests, is devoted to engineering complex computational systems from 
molecules. Recent work in this area has come to view abstract CRNs as 
a programming language to engineer “chemical software” um- Exciting 
new developments have shown methods of converting arbitrary chemical 
reaction networks into computation using DNA strands [21 m [19] . Thus the 
programmable power of chemical reaction networks is no longer simply of 
theoretical interest. To achieve the goal of engineering large scale, robust 
chemical computation, tools to analyze CRNs will be vital. 

There are many ways to define the behavior of abstract CRNs, the two 
most prominent being mass action kinetics and stochastic chemical reaction 
networks. Mass action kinetics was the first model to be studied extensively. 
It is a continuous, deterministic model of chemical reaction networks. Mass 
action kinetics is used to study systems with sufficiently large numbers of 
molecules so that the amount of a given molecule can be represented as 
a real-valued concentration. The dynamics of reactions under mass action 
kinetics are governed by deterministic differential equations. However, the 
deterministic mass action model is not well suited if the number of molecules 
of the system is low. Stochastic CRNs are widely used to analyze those sys¬ 
tems with a relatively low number of molecules mm- The stochastic CRN 
model is discrete and non-deterministic. Unlike mass action, the amount 
of each species is represented as a non-negative integer, and the reactions 
of a system are modeled as Markov jump processes |8]. The stochastic 
model is closely related to many well-studied models of computation such 
as Vector Addition Machines [TO], Petri Nets |6] and Population Protocols 
[T] . Recently, Chen, Doty and Soloveichik introduced a new model of chem¬ 
ical kinetics, rate-independent continuous CRNs (CCRNs) |3]. The CCRN 
model is continuous, dealing with real-valued concentrations of species, but, 
unlike the stochastic or mass action models, it is rate-free (reactions do not 
have any associated kinetic rate constant). Chen, Doty and Soloveichik used 
CCRNs to study which real valued functions / : M*' —>■ M were computable 
by a chemical reaction network. By being a rate-free model, it allows for the 
study of the computational power of large chemical systems relying on stoi¬ 
chiometry alone (i.e., without depending on specific rates of the reactions). 
This is important, as rate constants are hard to experimentally determine 
and vary under external factors such as temperature. 

A fundamental question one can ask of a stochastic chemical reaction 
network is whether a particular state is reachable from a starting config¬ 
uration; this is called the reachability problem. The reachability problem 
for stochastic CRNs is equivalent to an important problem in theoretical 
computer science, the Vector Addition System Reachability problem (VAS 
reachability) |5]. The VAS reachability problem was proven to be at least 


2 


EXPSPACE-hard by Lipton in 1976 [2], before it was even proven decid¬ 
able. In 1981, building on the work of Sacerdote and Tenney Mayr 
proved the reachability problem was decidable m- Subsequently, Kosaraju 
m and Lambert [12] gave two additional proofs of the decidability of VAS 
Reachability. However, all proofs that the reachability problem is decidable 
were very difficult, until Leroux m gave a greatly simplified proof. Unfor¬ 
tunately, we still do not know if this problem is decidable in any primitive 
recursive time bound. 

In this paper, we investigate two variants of the reachability problem 
in the context of CCRNs. In section 3, we analyze the complexity of the 
direct analog of the reachability problem for CCRNs, the continuous chemi¬ 
cal reaction network reachability problem (CCRN-REACH). Informally, the 
CCRN-REACH problem is: given a CCRN C and states c and d, output 
a path taking c to d, if one exists. To effectively compute CCRN-REACH, 
we will require the states to be over the rationals instead of over arbitrary 
reals. We show that, contrary to the difficulty of the VAS reachability prob¬ 
lem, CCRN-REACH can be computed in polynomial time. In the process, 
we give new definitions and lemmas which we believe will be useful in fur¬ 
ther investigations of the continuous chemical reaction network model. In 
section 4, we define a problem closely related to the reachability problem, 
called the Sub-CCRN-REACH problem. Sub-CCRN-REACH asks if a path 
exists between two states using at most k of the reactions in the network. 
In contrast to the computational “ease” of CCRN-REACH, we show that 
Sub-CCRN-REACH is NP-complete. 

2 Preliminaries 

Throughout the remainder of this paper || • || will be the max norm. Be¬ 
fore proving the main theorem, we will review preliminary dehnitions and 
notations for continuous CRNs. 

2.1 Rate Independent Continnons CRNs 

A continuous chemical reaction network (CCRN) is a pair C = (A, R), where 
A is a hnite set of species, and i? is a hnite set of reactions over A. We typ¬ 
ically denote species by capital letters, so that A = {A, B, ...}. A reaction 
over the set of species A is an element p = (r, p) € x N^, where r and 
p specify the stoichiometry of the reactants and products, respectively. We 
require the net change Ap = p — r of a reaction p = (r, p) to be nonzero. 
We will usually write a reaction using the “reactants, right arrow, prod¬ 
ucts” notation; for example, p = A + B ^ C (in this example r = (1,1,0) 
and p = (0,0,1)). A reaction p = (r, p) is catalytic if, for some species s, 
r(s) = p(s) 7^ 0 (for example, A + B ^ A + C). In this case, we call the 
species s a catalyst. Each CCRN C = (A, R) has an associated reaction 
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stoichiometry matrix M specifying the net change of each species for every 
reaction. Formally, M is a |A| x \R\ matrix over Z such that M(i,j) is 
the net change of the ith. species for the jth reaction. Note that M does 
not fully specify a CCRN C, since it does not identify catalytic reactions. 
A state of a CCRN C = (A, R) is a vector c € M>q specifying the (non¬ 
negative) concentration of each species. The support of a state c is the set 
supp{c) = {s G A I c(s) > 0} of all species with non-zero concentrations at c. 
The support of a reaction p = (r, p) is the set supp{p) = {s G A | r(s) > 0} 
of all reactants of p. A reaction p = (r, p) G R is applicable at a state c if 
supp{r) C supp{c) (i.e., if the concentration of each reactant is non-zero at 
c). A flux vector of a CCRN C = (A, R) is a vector u G The support 
of a flux vector u is the set supp{n) = {p G R | u(p) > 0}. A flux vector u 
is applicable at a state c if the following conditions hold: 

1. Every p G supp{n) is applicable at c. 

2. c{s) + u(p)Ap(s) > 0 for every s G A. 

peR 

If a flux vector u is applicable at state c, we can apply u to c, resulting in 
the state 


c * u = c -|- ^ u(p)Ap. 
p£R 

Equivalently, c * u = c -|- Mu. A flux vector sequence, U = (ui,..., u^) is 
a tuple of flux vectors. We apply a flux vector sequence U = (ui, ...,Ufc) 
iteratively to a state c. 


C*U = (C* (Ui, . . . ,Ufc_i)) =f=Ufc. 


A flux vector sequence U = (ui,...,Ufc) is applicable at state c if Uj is 
applicable at (c * (ui,... ,Uj_i)) for every 1 < i < k. If c and d are any 
states, we say that d is reachable from c, denoted c — >■* d, if there exists a 
flux vector sequence U applicable at c such that c * U = d. We say that d 
is reachable from c in k steps, denoted c d, if there exists a flux vector 
sequence U = {ui,... ,Uk) applicable at c such that c * U = d. A reaction 
p G R is eventually applicable from c if there exists a state d reachable from 
c so that p is applicable at d. A reaction is permanently inapplicable from 
c if it is not eventually applicable from c. 

The following theorem, proven in [3], will be used in the proof of our 
first main theorem. 

Theorem 0 . If c —>■* d, then c —d where m = |R| is the number of 
reactions. 
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3 The Reachability Problem for Continuous CRNs 

Having defined the relevant concepts for continuous chemical reaction net¬ 
works, we are now able to formally define our problem CCRN-REACH. 

The Continuous CRN Reachability Problem. Given a continuous 
CRN C = (A, R) and two states c, d G output a flux vector sequence 
U such that U is applicable at c and c * U = d, if one exists; output “not 
reachable” otherwise. 

Note that this problem becomes a trivial solution of a system of lin¬ 
ear equations if we drop the requirement that the the flux vector sequence 
must be applicable at c. We will prove that CCRN-REACH is computable 
in polynomial time. Intuitively, the dramatic difference in the computa¬ 
tional difficulty between the VAS reachability problem (known to be at least 
EXPSPACE-hard) and CCRN-REACH is the additional flexibility given by 
the rational valued flux vectors. To compute CCRN-REACH, we show how 
to build a flux vector sequence lead from the starting state to a state of 
maximal support. This is only possible in the CCRN model of chemical 
reaction networks, which allows arbitrarily small additions via flux vectors. 
Once we are in such a maximal state we are able to get to the end state 
with the application of a single flux vector. To formalize this intuition, we 
will introduce several definitions and lemmas. 

Fix a continuous CRN C = (A, R). 

Definition. Let c be a state, and e > 0. We say that a vector u is an 
e-max support flux vector of c if u satisfies the following: 

1. u is a flux vector that is applicable at c. 

2. for every flux vector v applicable at c, supp{c * v) C supp{c * u). 

3. ||u|| < e. 

That is, a vector is an e-max support flux vector of a state c if it is 
applicable at c and maximally increases the support of c while giving at 
most e flux to each reaction. We will show that e-max support flux vectors 
exist for every state and e > 0. 

Let e > 0. We now construct a specific e-max support flux vector of c, 
which we will henceforth call the e-max support flux vector of c. Define Appc 
to be the set of all applicable reactions at c. Let e^ = min{c{s) \ s G supp{c)} 
(the lowest nonzero concentration of any species at state c). Pc = max{l, 
|A/)(s)| : p G Appc}, and 5c,e = |^mm{^,e}. 

Definition. The e-max support flux vector of c is the vector Uc,e de¬ 
fined by 
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for every p ^ R. 


^cAp) 


Sc,e, if p e Appc 
0 , otherwise 


The following lemma shows that Uc,e is a well defined e-max support flux 
vector of c. 


Lemma 1. Let c be a state, and e > 0. Then Uc,e is an e-max support flux 
vector of c. 

When the context is clear, we will refer to Uc,e as the max support flux 
vector. The following observation can be easily seen from the definition of 
the max support flux vector. 


Observation 1. The e-max support flux vector of c, Uc,e, is computable in 
polynomial time in terms of{C,c,e). 

Since Uc,e is a flux vector applicable at c, we are able to discuss the max 
support flux vector of the state (c * Uc,e). For convenience, we will use the 
following notation: 

1 . uj , := Uc,e. 

2 . Up ^ := the e-max support flux vector of the state (c * (Up ^,..., Up p^)). 

It is important to note that the vectors u* ^ are distinct; in fact, the hope is 
that the set of applicable reactions grows with successive applications max 
support flux vectors. 

Definition. Let e > Q, m = \R\-\-1 and 7 = ^- The e-max support flux 
vector sequence of c, denoted Uc,e, is defined to be the sequence 


From Observation 1 it is clear that Uc,e is computable in polynomial 
time in terms of {C,c,e). 


Observation 2. For any state c and any e > 0, the e-max support flux vec¬ 
tor sequence of c is a flux vector sequence that is applicable at c. Moreover, 

m 

II E <7II ^ 

i=l 

Proof. This follows immediately from Lemma 1 and the choice of 7 . □ 

The choice of restricting the length of the flux vector Uc,e to |i?| -|- 1 is 
not arbitrary. We will show that this is all that is required to get to the 
largest possible support of a state. 
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Definition. Let c be a state and e > 0. We say that a state m is an e- 
max support state of c if, for every state d that is reachable from c, 
supp{d) C supp{in). 

Similar to our previous definitions, we now define the e-max support 
state of c to be 


mc,e := (c * Uc,e). 

Lemma 2. If c is a state and e > 0, then mc,e is an e-max support state of 

c. 

Since e was arbitrary in Lemma 2 , we see that for every e, e' > 0, 
supp(mc,e) = supp{m(.^e') ■ Recall that a reaction p is eventually applica¬ 
ble from a state c if p is applicable at some state d that is reachable from 
c. By Lemma 2, a reaction p is eventually applicable from a state c if and 
only if p is applicable at mc,e for any e > 0. This allows us to compute 
all the permanently inapplicable reactions from c, which will be vital in the 
algorithm computing CCRN-REACH. 

Observation 3. The set of all permanently inapplicable reactions from c is 
computable in polynomial time. 

Proof. By Observation 1 we compute the 1-max support state of c, mc,i, 
and eliminate all reactions not applicable at mc,i. □ 

We are now ready to prove our first main theorem. 

Main Theorem 1. CCRN-REACH is computable in polynomial time. 

Proof. Consider the following algorithm (Algorithm 1 below) deciding CCRN- 
REACH. From our previous observations, it is clear that the algorithm runs 
in polynomial time in terms of the input. We now prove that d is reachable 
from c if and only if the above algorithm outputs a flux vector sequence U 
applicable at c such that c * U = d. 

Assume that, on input C = (A, R), c and d, the algorithm outputs a 
sequence of vectors U. Let R be the set of reactions left after exiting the loop 
(necessarily non-empty), and m = |R|. By the choice of e and Observation 
2 , for each p € R, 

m+1 

U*,7(P) < S{p), 

i=l 

where Uc,e = (u^ ..., u™+^) (recall that 7 = 7 ^)- Therefore, the vector 

V = 5 — Uc,e is a flux vector (in fact v is strictly positive). Hence the 
output U = (Uc,e,v) is a flux vector sequence. By Observation 2, Uc,e is 
applicable at c. Upon exiting the loop we are guaranteed that any reactions 
remaining in R must be eventually applicable from c using only the other 
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1: Eliminate from R all permanently inapplicable reactions from c 
2: for each reaction p ^ R do 

3: Compute a vector Fp € Q>q such that c + MFp = d and Fp{p) > 0, 

if one exists 

4: if no such vector exists, eliminate p from R, GOTO 1. 

5: end for 

6: if R = 0, output “not reachable” 

7: otherwise define vector S G Q>q as follows 

7ri 

8: for each p € R, set S'(/9) = ^ ^iip) 

9: Compute e = 

10: Compute the max support flux vector sequence Uc,e 
11: Compute V = 5 — Uc,e 

12: Output (Uc,e, v) (padded with Os for eliminated reactions) 


remaining reactions. Let p G supp{v). Then p R, and so p must be 
eventually applicable from c using only reactions remaining in R. By Lemma 
2, (c * Uc,e) = mc,e is a max support state, therefore p is applicable at 
(c*Uc,e). Since p was arbitrary, v is applicable at (c*Uc,e), and so (Uc,e, v) 
is a flux vector sequence that is applicable at c. Finally, we have 



where M is the stoichiometry matrix of C = (A, R). Therefore if the algo¬ 
rithm outputs a vector sequence, then d is reachable from c. 

For the other direction, assume that d is reachable from c. Then, by 
definition, there is a nonempty subset R' F R such that, for all p G R', 

1. p is eventually applicable from c using only reactions from R', and 

2. there exists a vector Fp such that MRp = d — c and Fp{p) > 0. 
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Therefore the algorithm will exit the loop with R nonempty, and output a 
flux vector sequence (Uc,e, v). As we just shown, (Uc,e, v) applicable at c 
such that c * (Uc,e, v) = d. 

□ 


4 The Subset Reachability Problem 

Define the decision problem Sub-CCRN-REACH as follows. 

The Continuous CRN Subset Reachability Problem. Given a con¬ 
tinuous CRN C = (A, R), states c, d and integer k, accept if and only if 
there exists a path from c to d using only k reactions from R. 

In contrast to the computational ease of CCRN-REACH, we give evi¬ 
dence that the related problem Sub-CCRN-REACH is quite difficult. 

Main Theorem 2. Sub-CCRN-REACH is NP-complete. 

Proof. From the proof that CCRN-REACH is computable in polynomial 
time, it is easy to see that Sub-CCRN-REACH is in NP. Simply guess a 
subset of k reactions and decide CCRN-REACH on the subset. We will 
reduce 3SAT to Sub-CCRN-REACH to show hardness. 

Let (j) he a boolean formula on n variables xi,...,Xn with m clauses 
Cl,..., Cm- Construct an equivalent CCRN C^ = (A,^, R^) and states c^, 
as follows. 

For each Xi, define three species Si, Si and sj, and the following four 
reactions, where 0 is a null species (the reactants are being consumed without 
generating any products). 

1. Si —y Si 2. Si —y Si 3. Si —y 0 4. Si —y 0, 

For each clause Cj define one species Tj, and the following (catalytic) 
reactions 


1. Si ^ Si-\-Tj, for every Xi in Cj 2. s j —>■ s j -|- Tj, for every Xi in Cj 


Define the start state to have a concentration of 1 for each species Si 
and a concentration of 0 for all other species. Define the end state dj, to 
have a concentration of 1 for each species Tj, and a concentration of 0 for 
all other species. 

We now show that (f G 3SAT if and only if Cj, —)■* d^ using exactly 
2n -|- m reactions. Assume cj) G 3SAT, let x be any satisfying assignment. 
Define the flux vector ui by, 


ui(p) 


1 if p = —)■ Sj and x(xi) = 1 

< 1 if p = Si ^ Si and x(xi) = 0 
0 otherwise. 

\ 
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The flux vector ui transfers all of the concentration of Si into either Si or Sj, 
depending on the satisfying assignment x. The number of reactions given 
positive flux in ui is n. For each clause Cj, choose one variable Xi. or its 
negation that evaluates to true under x. Define the flux vector U 2 by, 


U2(/5) 


1 if p = Sij —)■ Sij + Tj and Xij is the chosen variable from Cj 
< 1 if p = sij —>■ s~ij + Tj and xlj is the chosen variable from Cj 
0 otherwise. 

\ 


Therefore U 2 only gives positive flux to m reactions, one for each clause in 
(j). Finally, define the flux vector U 3 by. 


U3(/3) 


1 if p = Sj — 7 > 0 and x(xi) = 1 
< 1 if p = Sj — 7 > 0 and x(xi) = 0 
0 otherwise. 

\ 


The flux vector U 3 eliminates the concentrations of each species Si or Si (only 
one of which has concentration 1). Clearly U 3 gives positive flux to only n 
reactions. Hence U = (ui, U2, U3) gives positive flux to only 2n + m distinct 
reactions, and * U = 

Assume * U = Since Si has concentration 0 at d at least one of 
the reactions Si —>■ Si, Si Si must be used, that is, at least n reactions. 
Similarly, since Sj and Sj have concentration 0 at d,^, at lest n must be used 
in any flux vector sequence. Since Tj has concentration 1 at dj, at least 
one reaction of the form Si ^ + Tj or Sj —>■ Sj + Tj must be used for 

each Tj, so at least m reactions. Hence U must give positive flux to at least 
2n + m reactions. In order to reach d using the minimal number of reactions, 
2n + m, U must only give flux to one of Si Si or 5* —s*. Let x be the 
assignment of the variables (xi,..., Xn) given by 


x(xi) 


1 ifV{Si^Si) = l 
0 otherwise. 


Since * U = d,^, U gives positive flux to Sj —>■ s* + Tj or Si ^ Si + Tj 
for each species Tj and some i. Therefore each clause Cj must be satisfiable 
under assignment x. Hence, if —>■* d,^ using exactly 2n + m reactions, 
then (p G 3SAT. □ 
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5 Appendix 

Proof of Lemma 1. Eirst we show Uc,e is a flux vector that is applicable 
at c. It is clear that Uc^e is a flux vector. It suffices to show that every 
reaction p G supp(uc^e) is applicable at c and that (c * Uc,e) G K>o. Erom 
the definition of Uc^e, Uc,e(/o) > 0 if and only if p G Appc- Therefore if 
p G supp{vLc^e)-, then p is applicable at c. To complete the proof of item (1) 
we show c*Uc,e remains non-negative. Let s G A be any species, and assume 
that the concentration of s at c is greater than 0, i.e., s G supp{c). By the 
definition of Uc^e, 

I X] Uc,e(7')A/9(s)| < (5c,e|7?|rc < ^ 

peAppc 
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and therefore. 


(C * Uc,e)(s) = C(s) + ^ Uc,e(/9)Ap(s) 
pGR 

>c(s)-| Uc,,(/))A/)(s)| 

peAppc 



> 0 . 


Hence for every species s G supp{c), (c * Uc,e)(s) > 0. Now assume s ^ 
supp{c)] the concentration of s at c is 0 . As we have seen, the only reactions 
p such that Uc,e(/ 9 ) > 0 are those reactions which are applicable at c. By our 
assumption c(s) = 0 , any applicable reaction p at c must have A/ 9 (s) > 0 . 
It is therefore clear that 


(c * Uc,e)(s) > 0. 

We now prove that, for every flux vector v applicable at c, supp{c * 
v) C supp{c * u). Let s G supp{c * v). We first assume that s G supp{c). 
We showed previously that if s G supp{c), then (c * Uc,e)(s) > 0. Hence 
s G supp{c * Uc,e). Now assume that s ^ supp{c). As the concentration 
of s at c is 0 , any applicable reaction p at c must have Ap(s) > 0 . Since 
s G supp{c * v), there must be at least one reaction ps applicable at c such 
that Aps{s) > 0. Since ps is applicable at c, Uc,e(/9s) = (5c,e- Thus 

(c * Uc,e)(s) = c(s) + Y Uc,e(/9)A/9(s) 
pGR 

= I Y W,£(/3)A/9(s)| 

p&Appc 

> Uc,e{Ps)Aps{s) 

> Sc,e 

> 0 . 

Finally, it is immediate that ||uc,e|| < e. 

Proof of Lemma 2. Let d be a state reachable from c. By Theorem 0, 
there exists a flux vector sequence of length m = |i?| + 1 taking c to d, i.e. 
c d. By induction and use of Lemma 1, we see that for every state d 
such that c d, supp{d) C supp{inc,e)- 
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